Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement delete_objs in fs and s3 storage backends. #395

Merged
merged 6 commits into from
Aug 20, 2021

Conversation

zijie0
Copy link
Contributor

@zijie0 zijie0 commented Aug 18, 2021

Description

Implement delete_objs in fs and s3 storage backends. Add corresponding test cases.

Related Issue(s)

Documentation

@zijie0 zijie0 changed the title Implement delete_objs in fs ans s3 storage backends. Implement delete_objs in fs and s3 storage backends. Aug 18, 2021
@Dandandan
Copy link
Contributor

Cool @zijie0 i see you were ahead of me.

#396

Maybe there is some inspiration to take from the other PR

@@ -134,6 +134,17 @@ impl StorageBackend for FileStorageBackend {
async fn delete_obj(&self, path: &str) -> Result<(), StorageError> {
fs::remove_file(path).await.map_err(StorageError::from)
}

async fn delete_objs(&self, paths: &[String]) -> Result<(), StorageError> {
Copy link
Contributor

@Dandandan Dandandan Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think paths: &[&str] would be a slightly better signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally used &[&str] but later met some problem when calling this method from vacuum... It said that: temporary value created here. So I think maybe we should use String instead of &str here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, I think the method should be defined as a generic function taking impl Iterator<AsRef<str>> as input. But this trait is being used as trait object, so we can't use generic here, which makes things really annoying. I also think &[&str] would have been a cleaner interface, but it will force us to allocate both Vec<String> and Vec<&str> in vacuum method, which has performance overheads :( I also don't really have a good suggestion on how to workaround this issue here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dandandan Do you have any suggestion on this? Or we could go ahead with &[String] for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine 👍

rust/src/storage/mod.rs Outdated Show resolved Hide resolved
rust/Cargo.toml Outdated Show resolved Hide resolved
@Dandandan
Copy link
Contributor

I added some comments @zijie0

@zijie0
Copy link
Contributor Author

zijie0 commented Aug 18, 2021

@Dandandan Thank you for your suggestion, learned a lot from your PR :)

* Batch delete in chunk.
* Add default implementation for `delete_objs`.
@rtyler rtyler added binding/rust Issues for the Rust crate enhancement New feature or request storage/aws AWS S3 storage related labels Aug 18, 2021
@Dandandan
Copy link
Contributor

I think this is a great improvement.

I am wondering a bit how errors in the S3 implementation are handled, as the notfound error is not explicitly raised.

But I don't think that should be covered in this PR.

@houqp houqp merged commit 842d526 into delta-io:main Aug 20, 2021
@houqp
Copy link
Member

houqp commented Aug 20, 2021

Thank you both for this new optimization @zijie0 @Dandandan !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate enhancement New feature or request storage/aws AWS S3 storage related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Leverage batch delete in vacuum
4 participants